ABSTRACT



The data storage system is connected to a local area network 
and includes a storage server that on a demand basis and/or 
on a periodically scheduled basis audits the activity on each 
volume of each data storage device that is connected to the 
network. Low priority data files are migrated via the network 
and the storage server to backend data storage media, and 
the directory resident in the data storage device is updated 
with a placeholder entry to indicate that this data file has 
been migrated to backend storage. When the processor 
requests this data file, the placeholder entry enables the 
storage server to recall the requested data file to the data 
storage device from which it originated. 

25 Claims, 9 Drawing Sheets 
DATA STORAGE MANAGEMENT FOR NETWORK INTERCONNECTED PROCESSORS

CROSS-REFERENCE TO RELATED APPLICATION

This application is a divisional of a patent application entitled "Data Storage Management For Network Interconnected Processors," Ser. No. 08/201,658 filed Feb. 25, 1994, now U.S. Pat. No. 5,537,585.

FIELD OF THE INVENTION

This invention relates to data communication networks, such as local area networks, that function to interconnect a plurality of data processors with data storage subsystems, and to a data storage management system that automatically migrates low priority data files from the data storage subsystems to backend data storage to provide more available data storage space in the data storage subsystems.

PROBLEM

It is a problem in the field of local area networks to provide both adequate data storage resources for the processors connected to the network as well as efficient data storage management capability associated with the data storage subsystems that are connected to the network and which serve the processors. Existing local area networks interconnect a plurality of processors with a number of data storage devices, also termed data storage subsystems, on which are stored the data files used by the processors. The term data files is used to characterize the various data that can be stored on memory devices and includes data managed by file servers, databases, application servers, and note systems, which systems are collectively termed "file servers" herein. Typically, the data storage subsystems are individual magnetic disk drives or disk drive array data storage subsystems.

A problem with this network configuration is that these data storage subsystems are very expensive. A significant portion of the data that is stored thereon is little used and cannot justify the use of expensive data storage media. In the corresponding area of data storage management, there is typically no management of the data files that are stored on these data storage subsystems that are directly connected to the network. A data storage management activity is typically initiated only in response to a processor encountering inadequate available data storage space on the data storage subsystems. At this point, a user typically manually deletes various unused or little used data files or manually rewrites these data files to another media, such as magnetic tape, that can be placed in archive storage for availability at a later time. This data storage management philosophy is highly inefficient in that data processing operations must cease while a user manually removes data files from the data storage subsystem to obtain additional data storage space. This form of manual data storage space allocation is inefficient since some of the data files that are deleted or archived may not be the best candidates for such processing. Furthermore, the data storage media remains unmanaged between these randomly occurring spurts of data management activity.

In addition, the retrieval of archived data files is cumbersome since the identification of archived data files is typically expunged from the file server and listed in a separate archived files directory. Thus, the file server must first scan the file server directory, then the archived files directory in response to a host processor request for an archived data file. This recursive search process is wasteful of processing resources.

Prior art data storage systems include one described in U.S. Pat. No. 5,367,698, which describes a networked file migration system for a plurality of file servers. The file servers migrate the files to a backend data storage system and record the address of the relocated files in a special migrated file directory which is stored in the file server address space so these files can be directly addressed via the network. The data files are managed and migrated on an individual file basis.

Another data storage system is disclosed in U.S. Pat. No. 5,276,867, which uses a hierarchical storage system having three levels to migrate data from the online storage to ever greater capacity and lesser speed storage devices as the data ages. The data files are managed and migrated on an individual file basis.

SOLUTION

The above-described problems are solved and a technical advance achieved in the field by the data storage management system of the present invention. The data storage management system is connected to the network and provides a hierarchical data storage capability to migrate lower priority data files from the data storage subsystems that are connected to the network to backend less expensive data storage media, such as optical disks or magnetic tape. A data storage management capability is also included to provide automated disaster recovery data backup and data space management capability. In particular, a placeholder entry is inserted into the directory entry in the managed file server volume for each migrated data file. The placeholder entry both indicates the migrated status of the data file and provides a pointer to enable the requesting processor to efficiently locate and retrieve the requested data file.

The data storage management system implements a virtual data storage system, comprising a plurality of virtual file systems, for the processors that are connected to the network. The virtual data storage system consists of a first section that comprises a plurality of data storage subsystems, each consisting of file servers and their associated data storage devices, which are connected to the network and serve the processors. A second section of the virtual data storage system comprises the storage server, consisting of a storage server processor and at least one layer of hierarchically arranged data storage devices, that provides backend data storage space. The storage server processor interfaces to software components stored in each processor and file server that is connected to the network. The storage server, on a demand basis and/or on a periodically scheduled basis, audits the activity on each volume of each data storage device that is connected to the network. Data files that are of lower priority are migrated via the network and the storage server to backend data storage media. The data file directory resident in the data storage device that originally contained this data file is updated with a placeholder entry in the directory to indicate that this data file has been migrated to backend data storage. Therefore, when a processor requests this data file, the placeholder entry is retrieved from the directory and the storage server is notified that the requested data file has been migrated to backend storage and must be recalled to the data storage device from which it originated. The storage server automatically retrieves the requested data file using information stored in the placeholder entry and
transmits the retrieved data file to the data storage device from whence it originally came. The storage server, backend data storage and processor resident software modules create a virtual storage capacity for each of the data storage devices in a manner that is transparent to both the processor and the user. Each virtual volume in this system can be expanded in extent in a seamless manner to match the needs of the processor by using low cost mass storage devices.

In operation, the storage server monitors the amount of available data storage space on each of the volumes (network volumes) on each of the data storage devices to ensure that adequate data storage space is available to the processors on a continuing basis. When the available data storage space drops below a predetermined threshold, the storage server reviews the activity levels of the various data files that are stored therein and automatically migrates the lower priority data files to the backend data storage as described above. Furthermore, the backend data storage is similarly managed with the lower priority data files being migrated from layer to layer within the multi-layer hierarchical data storage as a function of their activity level, content and the amount of available data storage space on these various layers. Therefore, each layer of the hierarchical storage is populated by data files whose usage pattern and priority is appropriate to that layer or type of media. The data storage devices can be viewed as comprising a first layer of this data storage hierarchy while a backend disk drive or disk drive array can be a second layer of this data storage hierarchy. Successive layers of this hierarchy of data storage devices can incorporate optical disks, and/or magnetic tape, and/or automated media storage and retrieval libraries, and/or manual media storage and retrieval libraries.

When a data file is recalled by the storage server, it is transmitted from its backend data storage location directly to a data storage device where it is accessed by the requesting processor. The data file remains on this data storage device until it is migrated to backend storage as a function of the normal audit and migration procedures of the storage server.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 illustrates in block diagram form the overall architecture of a typical local area network that includes the data storage management system of the present invention;

FIG. 2 illustrates in block diagram form the various network software components;

FIG. 3 illustrates in conceptual view the architecture of the hierarchical memory of the data storage management system of the present invention;

FIG. 4 illustrates a physical implementation of the hierarchical memory of the data storage management system of the present invention;

FIG. 5 illustrates in block diagram form the data file migration and backup paths taken in the data storage management system;

FIG. 6 illustrates in flow diagram form the operational steps taken by the apparatus of the present invention to perform a routine sweep operation;

FIG. 7 illustrates in block diagram form the data file recall path taken in the data storage management system;

FIG. 8 illustrates in flow diagram form the operational steps taken by the apparatus of the present invention to perform a data file recall operation;

FIG. 9 illustrates in graphical form the data storage management processes of the present invention on a time-wise basis;

FIG. 10 illustrates in block diagram form various components of the hierarchical storage manager software;

FIGS. 11 and 12 illustrate two embodiments of data transfer units used in data migration in the secondary storage; and

FIG. 13 illustrates a typical directory structure used by a file system.

DETAILED DESCRIPTION

Local area networks are increasingly becoming an integral feature in the business environment. FIG. 1 illustrates in block diagram form the overall architecture of a typical local area network 1 and the incorporation of the data storage management system of the present invention into the local area network 1. A local area network 1 consists of data communication link 11 and software (not shown) that interconnects a plurality of processors 21, 22 with a number of file servers 41-43. The processors can be personal computers, work stations, mini-computers or any other processing element. For the simplicity of description, all of these devices are described by the generic term "processor". While many of these processors 21, 22 may contain a significant amount of data storage capacity, it is not uncommon for the local area network 1 to be equipped with additional data storage capacity to supplement that of the processors 21, 22 themselves. The data storage devices 31-33 that are connected to the data communication link 11 of the local area network 1 are typically high-speed random access devices, such as high capacity disk drives or even disk drive arrays, to thereby substantially be compatible with the operating speed of the processors 21, 22 and the data communication link 11. Each data storage device 31-33 is included in a file server 41, work station 42 or other type of server 43 which serves as an interface between the network 1 and the data storage device 31-33, such as a disk drive. For simplicity of description, the data storage capacity provided by the file servers 41-43 and its associated data storage device 31-33 is referred to as "file server" herein.

Each processor 21 that is connected to the local area network 1 is typically capable of accessing at least one volume on one of these file servers 41 as directly accessible additional data storage space for the use of this processor 21 to store data files. The term data files is used to characterize the various data that can be stored on data storage devices and includes data managed by file servers, databases, application servers, and note systems, which are collectively referred to as "file servers" herein. In this system, the local area network 1 provides a communication fabric over which the processors 21, 22 and the file servers 41-43 communicate via a predetermined protocol. The disclosed configuration and implementation of the local area network 1 and its protocol, processors 21, 22, file servers 41-43 as described herein are simply illustrative of the invention and there are numerous alternate embodiments of this system that are possible.

In addition to the processors 21, 22 and the file servers 41-43, the data storage management system of the present invention includes the data storage management apparatus connected to the local area network 1. This data storage management apparatus comprises a storage server 50 that is connected to the local area network 1. A storage server processor 51 serves to interface the local area network 1 with the backend data storage devices 61-65 (FIG. 4) that constitute the secondary storage 52. The backend data storage devices 61-65, in combination with the file servers 41-43 comprise a hierarchical data storage system. The backend
data storage devices 61-65 typically include at least one layer of data storage that is less costly than the dedicated data storage devices 31-33 of the file servers 41-43 to provide a more cost-effective data storage capacity for the processors 21, 22. The data storage management system implements a virtual data storage space for the processors 21, 22 that are connected to the local area network 1. The virtual data storage space consists of a first section A that comprises a primary data storage device 31 that is connected to the network 1 and used by processors 21, 22. A second section B of the virtual memory comprises the secondary storage 52 managed by the storage server processor 51. The secondary storage 52 provides additional data storage capacity for each of the primary data storage devices 31-33 represented on FIG. 1 as the virtual devices 31S-33S attached in phantom to the primary data storage devices 31-33 of the file servers 41-43. Processor 21 is thereby presented with the image of a greater capacity data storage device 31 than is connected to the file server 41. The storage server 51 interfaces to software components stored in each processor 21, 22 and file server 41-43 that is connected to the local area network 1. The storage server processor 51, on a demand basis and/or on a periodically scheduled basis, audits the activity on each volume of each data storage device 31-33 of the file servers 41-43 that are connected to the network 1. Data files that are of lower priority are migrated via the network 1 and the storage server processor 51 to backend data storage media of the secondary storage 52. The data file directory resident in the file server 41 that originally contained this data file is updated with a placeholder entry in the directory to indicate that this data file has been migrated to backend data storage. Therefore, when the processor 21 requests this data file, the placeholder entry is retrieved from the directory and the storage server processor 51 is notified that the requested data file has been migrated to backend storage and must be recalled to the file server 41 from which it originated. In the case of a processor 21, 22 and 42 that interfaces to a user, the storage server 50 may provide the user with a notification where necessary that a time delay may be noted in accessing the requested data file. The storage server processor 51 automatically retrieves the requested data file and transmits it to the data storage device 31 from whence it originally came. The storage server processor 51, secondary storage 52 and processor resident software modules create a virtual storage capacity for each of the file servers 41-43 in a manner that is transparent to both the processor 21, 22 and the user. Each virtual volume in this system can be expanded in extent in seamless manner to match the needs of the processors 21, 22 by using low cost mass storage devices to implement the secondary storage 52.

Network Software

FIG. 2 illustrates in block diagram form the typical components of the network software, including the data storage management software of the present invention. There are a number of network servers presently available on the market, with the Novell NetWare® software representing the dominant product in this market. The following description is therefore couched in terms of a NetWare® embodiment for simplicity of description, although the invention is not limited to this embodiment.

The network software includes an operating system 211 which functions to provide the basic network framework. In addition, a plurality of modules are provided to support the various functions that are essential to the functioning of the processors that are connected to the network. These modules include, but are not limited to: file management 212, print control 213, data storage management 214, communications 215, data file directory 216.

The data storage management system of the present invention includes data storage devices shown in FIG. 1 as well as data storage management software 214 that is incorporated into the network software. The data storage management software 214 includes a plurality of modules, each of which provide a specific function in the general data storage management task. The modules illustrated in FIG. 2 are: disaster recovery facility 221, object access management facility 222, and hierarchical storage management 223. These modules represent the various typical features that are provided to users of the network to enable them to obtain improved data storage efficiency. Within each module there may be a number of additional processes that are incorporated into the category of the listed module.

Hierarchical Storage Management Architecture

FIG. 3 illustrates the philosophical architecture and FIG. 4 illustrates one possible hardware implementation of the hierarchical data storage management system. The user at a processor 21 interfaces with a primary data storage device P via the network 1. The primary storage device P consists of a file server 41 and its associated data storage device(s) 31, such as a disk drive. The file server 41 manages the data storage media of the associated data storage device 31 in a well known fashion. The data storage device 31 is typically divided into a number of volumes, which can be termed network volumes. Additional volumes are provided by the assignment of additional volumes in the same data storage device 31 or the addition of further data storage devices to the network 1.

As illustrated in FIG. 3, the secondary storage 52 is divided into at least one and more likely a plurality of layers 311-313, generally as a function of the media used to implement the data storage devices 61-65. In particular, the second layer 311 of the hierarchical data storage, which is the first layer of the secondary storage 52, can be implemented by high speed magnetic storage devices 61. Such devices include disk drives and disk drive arrays. The third layer 312 of the hierarchical data storage, which is the second layer of the secondary storage 52, can be implemented by optical storage devices 62. Such devices include optical disk drives and robotic media storage and retrieval library systems. The fourth layer 313 of the hierarchical data storage, which is the third layer of the secondary storage 52, can be implemented by slow speed magnetic storage devices 63. Such devices include magnetic tape drives and robotic media storage and retrieval library systems. An additional layer 314 of the hierarchical data storage can be implemented by the use of a "shelf layer", which can be implemented by manual storage of media. This disclosed hierarchy is simply illustrative of the data storage management concept and the number, order and implementation of the various layers can differ from that disclosed herein.

As can be seen in FIG. 3, data files can migrate from the file server volumes of the first section A of the virtual memory to the data storage devices 61 of the second section B of the virtual memory. In addition, these data files can be relocated from the first layer 311 of the secondary storage 52 to the second 312 and third layers 313 of the secondary storage 52 as a function of the activity of the data files. As shown in FIG. 3, the data files can be recalled directly to the file server volumes from any layer of the secondary storage 52.

Shelf Layer

As data files are transmitted to the storage server 51 for migration to secondary storage 52, they are automatically
protected from loss in several ways. The data storage
devices 61 in the first layer 311 of the second section of the 
virtual data storage system are typically protected by the use 
of shadow copies, wherein each data storage device 61 and 
its contents are replicated by another data storage device 65
and its contents. In addition, as data files are migrated to the 
storage server 51 for retention, they are packaged into large 
blocks of data called transfer units. The transfer units are 
backed up via a backup drive 71 on to a separate backup 
media 72, such as high density magnetic tape media. Mul- 
tiple copies of this backup media 72 may be created to 
provide both off-site and on-site copies for data security. A 
backup media rotation scheme can be implemented to rotate 
the backup media between a plurality of locations, typically 
between an on-site and an off-site location to protect against 
any physical disasters, such as fire. When the lowest layer
313 of the second section of the virtual data storage space
becomes nearly full, the data storage devices 63 that com- 
prise this layer are reviewed to identify the lowest priority 
transfer units contained thereon. These identified transfer 
units are deleted from this layer and the secondary storage 
directories are updated to indicate that the data files con- 
tained in these deleted transfer units have been "relocated" 
to the shelf layer 314. No physical movement of the transfer 
units or the data files contained therein takes place. The 
relocation is virtual, since the data files are presently stored 
on backup media 72 that was created when these identified 
data files were initially migrated to the first layer of the 
secondary storage. The placeholder entry for each of the data 
files contained in the deleted transfer units is not updated, 
since the data files are still accessible within the data storage 
system. The secondary storage directories are updated to 
note that the data files are presently stored on the shelf layer
314 and the identity of the media element 72 that contains
this data file is added to the directory entry for this data file. 
This shelf storage concept is very convenient for temporary 
overflow situations where free space is required at the lowest 
layer 313 of the hierarchy but the user has not procured 
additional data storage devices 63. Where the user subse- 
quently does expand the data storage capacity of this layer, 
the overflowed data can be automatically retrieved from the 
shelf storage and placed in the additional data storage space. 
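
By way of illustration only, the virtual relocation to the shelf layer described above can be sketched in Python. The names TransferUnit, relocate_to_shelf and high_water, the use of a numeric priority, and the dictionary-based secondary storage directory are all hypothetical and are not part of the disclosure; the sketch assumes that a lower priority value marks a better candidate for relocation.

```python
from dataclasses import dataclass, field


@dataclass
class TransferUnit:
    unit_id: str
    priority: int          # assumption: lower value = lower priority
    size: int              # space occupied on the lowest layer
    backup_media_id: str   # backup media element written at migration time
    files: list = field(default_factory=list)


def relocate_to_shelf(layer_units, directory, capacity, high_water=0.9):
    """Virtually relocate lowest-priority transfer units to the shelf layer.

    layer_units: transfer units currently on the lowest layer (mutated)
    directory:   secondary storage directory, file name -> entry dict (mutated)
    Returns the ids of the transfer units deleted from the layer.
    """
    relocated = []
    used = sum(u.size for u in layer_units)
    # Visit units from lowest to highest priority until enough space is free.
    for unit in sorted(layer_units, key=lambda u: u.priority):
        if used <= high_water * capacity:
            break
        layer_units.remove(unit)   # delete from the layer; no data is moved
        used -= unit.size
        for name in unit.files:
            # Only the secondary storage directory changes; the placeholder
            # entry on the file server is left untouched.
            directory[name]["layer"] = "shelf"
            directory[name]["media_id"] = unit.backup_media_id
        relocated.append(unit.unit_id)
    return relocated
```

In this sketch no data is copied, which mirrors the statement that the relocation is virtual: the backup media already holds the data files, so only the directory entries change.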

When a processor 21 requests access to a data file that is 
stored in the shelf layer 314, the storage server 51 retrieves 
the physical storage location data from the secondary stor- 
age directory associated with the requested data file. This 
data includes an identification of the media element 72 that 
contains the requested data file. The physical location of this 
media element 72 is dependent on the data read/write 
activity and configuration of the system. It is not unusual for 
the identified media element 72 to be mounted on the backup 
drive 71 that performs the data file backup function. If so, 
the data file is retrieved from this backup drive 71. If the 
media element 72 has been removed from the backup drive 
71, an operator must retrieve the removed media element 72 
and mount this media element on a drive 71 to enable the 
storage server 51 to recall the requested data file from the 
media element 72 and transmit the data file to the file server 
31 used by the requesting processor 21. The retrieved media 
element 72 can be mounted on the backup drive 71 or a 
separate drive can optionally be provided for this purpose to
enable the storage server 51 to continually backup data files 
as they are migrated to secondary storage 52. Thus, the 
backup media 72 serves two purposes: backup of data files, 
and shelf layer 314 of storage in the data storage hierarchy. 
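
The recall path from the shelf layer can likewise be sketched, purely as an illustration. The function recall_from_shelf, the mounted mapping of media identifiers to media contents, and the operator_mount callback standing in for the manual retrieval step are hypothetical names introduced here and do not appear in the disclosure.

```python
def recall_from_shelf(name, secondary_dir, mounted, operator_mount):
    """Recall one data file that the secondary directory places on the shelf.

    secondary_dir:  file name -> entry dict holding a "media_id"
    mounted:        media_id -> {file name: data} for media now on a drive
    operator_mount: callable(media_id) -> media contents; models an operator
                    fetching the media element and mounting it on a drive
    Returns (data, operator_needed).
    """
    media_id = secondary_dir[name]["media_id"]
    operator_needed = media_id not in mounted
    if operator_needed:
        # The media element was removed from the drive, so it must be
        # retrieved and mounted before the data file can be read.
        mounted[media_id] = operator_mount(media_id)
    return mounted[media_id][name], operator_needed
```

The second element of the returned pair indicates whether operator intervention was required, which is the case in which the requesting user would experience the longest delay.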

Retirement Layer 

When data files have not been utilized for an extended 
period of time, they should be removed from the virtual data 
storage system and placed in another managed data storage
system that does not utilize the more expensive automatic 
resources of the virtual data storage system. It is advanta- 
geous to track these retired data files in the event that they 
need to be retrieved. The retirement layer 315 performs this 
function. When a data file is retired, it no longer is part of 
the virtual data storage system and its placeholder entry is 
deleted from the primary storage directory. In addition, the 
identification of the data file and any other properties that 
were recorded in the secondary storage directory are saved 
and placed in a separate retirement directory. The retired 
file's placeholder entry, secondary storage directory entry 
and backup directory entry are deleted. 
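
As a minimal sketch of the retirement bookkeeping just described, assuming each directory is modeled as a dictionary keyed by file name (an assumption made only for this example, as is the function name retire_file):

```python
def retire_file(name, primary_dir, secondary_dir, backup_dir, retirement_dir):
    """Move the record of a data file into the retirement directory.

    The properties recorded in the secondary storage directory are saved in
    the retirement directory; the placeholder entry, the secondary storage
    directory entry and the backup directory entry are then deleted.
    """
    saved = dict(secondary_dir.pop(name))   # save properties, delete entry
    primary_dir.pop(name, None)             # delete the placeholder entry
    backup_dir.pop(name, None)              # delete the backup directory entry
    retirement_dir[name] = saved
    return saved
```

After the call the data file is no longer visible in the virtual data storage system, but its identification remains available in the retirement directory should it need to be retrieved.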

To simplify the management of the retirement directory, it 
can be partitioned into segments, each of which lists data 
files that were last accessed during a designated time period. 
The structure of the retirement directory can follow the 
scheme of the underlying virtual file system directory 
structure, such as a basic tree structure. The virtual file 
system usually starts at the volume level of the tree, but the 
directory structure of the retirement directory can be 
expanded upward to include servers within a defined 
domain. The domain can represent divisions of a 
corporation, or any other segmentation of the data files that 
is conceptually higher than the server level. This expansion 
enables the storage server 51 to distribute the retirement 
directory across the local area network 1 for storage by file 
server 41-43. Any tree searches for a retired data file can 
then be concurrently performed by the plurality of file 
servers 41-43. Data files are typically retired as a group that 
constitutes the oldest transfer unit(s) that may be on the 
oldest media in the data storage hierarchy, or oldest transfer 
unit(s) in a given virtual file system, if the hierarchy is 
organized by virtual file system. The data file retirement 
process examines the time of last access for each data file 
that is retired and places an entry in the retirement directory 
that corresponds to this temporal partition. Thus, each retire- 
ment directory segment is a journal of retired data files over 
a last accessed interval and also organized by domain. Each 
domain has a tree structure for its directory which can be 
parsed by file server 41-43 or volume and distributed over 
the local area network 1 to the corresponding file server. 
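
The segmented retirement directory can be illustrated with the following sketch. It assumes, solely for the example, that the designated time period is a calendar quarter and that each segment is a nested dictionary of domain, file server and volume; the names segment_key, add_retired and subtree_for_server are hypothetical.

```python
from datetime import date


def segment_key(last_access):
    """Map a last-access date to its segment (assumed here: calendar quarter)."""
    return (last_access.year, (last_access.month - 1) // 3 + 1)


def add_retired(directory, domain, server, volume, path, last_access):
    """Journal a retired data file under segment -> domain -> server -> volume."""
    tree = directory.setdefault(segment_key(last_access), {}).setdefault(domain, {})
    tree.setdefault(server, {}).setdefault(volume, []).append(path)


def subtree_for_server(directory, server):
    """Extract the part of every segment that belongs to one file server,
    i.e. the portion that could be distributed to that server for storage."""
    out = {}
    for seg, domains in directory.items():
        for domain, servers in domains.items():
            if server in servers:
                out.setdefault(seg, {})[domain] = {server: servers[server]}
    return out
```

Because each file server holds only its own subtree in this arrangement, a search for a retired data file can be issued to all file servers at once and each searches only its own portion.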



Data Management System Software 

The data management system software of the present 
invention manages the flow of data files throughout the 
system. The block diagram of FIG. 10 illustrates a concep- 
tual client-server view of the network and the data manage- 
ment system software. The data communication link 11 of 
the local area network 1 is illustrated having the storage 
server processor 51 and three file systems 41-43 attached 
thereto. The storage server processor 51 includes the network
operating system 111 as well as the data storage
management system software consisting of various media 
and device management user interfaces 112 and control and 
services software 113. Each file server 41-43 includes a 
storage server agent 121-123 and any processor of the 
network can include and run an administrative user interface 
131. The control and services software 113 looks at the 
system as a set of clients that are connected to the network 
1 and which require services from the storage server 50. 
Each file server 41-43 communicates with the storage server 
processor 51 via the resident storage server agent software 
121-123. Thus, the data management system software is 
distributed throughout the network and serves to transpar- 
ently integrate all the elements connected to the network into 
the data storage hierarchy. 
The storage server agent 121-123 represents a component
that is installed in each file server 41-43 in the local area
network 1 and functions to redirect requests for migrated
data files from the file server 41-43 which was the original
repository of the requested data file to the storage server 50.
The storage server agent 121-123 provides whatever
interfaces are required to redirect data file access from the file
server 41-43 to the storage server processor 51 and
secondary storage 52. In the case of a processor 21, 22, 42 that
interfaces to a user, the storage server 50 may provide the
user with a notification that a time delay may be noted in
accessing the requested data file. Thus, the storage server
agent 121-123 has a personality that is tailored to the
underlying client file server platform or environment. For
example, where the file server is a database management
server, the storage server agent interfaces with the database
management system object manager to allow automatic
migration and recall of database objects, which can be
viewed as sub-files. Another example is the NetWare® file
system access manager which traps any NetWare®
supported file system calls at the file server. This also allows the
automatic recall of migrated data files to be triggered.

Using these basic elements, numerous variations of the
local area network 1 can be configured, having multiple
processors 21, 22 and multiple file servers 41-43, each with
their attached data storage devices 31-33. The processor 51
on which the storage server software runs includes a
physical interface to the data communication link 11 of the local
area network 1.

Real Time Network Storage Space Management

FIG. 9 illustrates a chart of configured volume space
utilization over time for a typical network volume in the
primary storage. As can be seen from this chart, the level of
network volume space utilization varies over time as a
function of the actions of the data storage management
system of the present invention. An unmanaged network
volume suffers from monotonically increasing space
utilization. When a configured network volume becomes
overutilized, the user previously had to manually remove
sufficient data files from the network volume to obtain
adequate data storage space for use of the processor. The
chart of FIG. 9 includes several predefined space utilization
levels. These levels are listed as "critical", "acceptable",
"optimal". The data storage management system activates
various procedures within the hierarchical data storage
management application as a function of the level of configured
volume space utilization. Various peaks of the curve are
designated by the name of the procedure that is activated at
that time to reduce volume space utilization.

For example, "sweep" is a data storage space management
procedure that is initiated on a routine basis. The sweep
procedure is typically initiated at a predetermined time each
day and runs to reduce the configured volume space
utilization to a level below that labeled as optimal on the chart
of FIG. 9. The sweep procedure migrates the lowest priority
data files from the network volume to the media of the
secondary storage 52 to ensure that there is an adequate
quantity of available data storage space on the network
volume each day as operations are initiated by the users of
the various processors 21, 22 that are connected to the
network 1. The space management procedures can include a
plurality of concurrently operational space management
rules. Thus, data files can be selected for migration as a
function of the time of last access, size, quantity of data
storage space available on the network volume. If
management rules allow more data files to be migrated from a
selected network volume to secondary storage 52 than
required to reach the optimal level, these additional data files
are "pre-migrated" to secondary storage 52. The
pre-migration of data files entails migrating the data files to
secondary storage 52 but not deleting (truncating) the data
files from the network volume. The pre-migrated data files
are marked as pre-migrated in the file system directory to
indicate that the data files exist in both the network volume
and the secondary storage 52.

During the day, the network volume tends to fill
as a result of file expansion, data file copying and newly created
data files. The space task of the hierarchical data storage
management application continually monitors the level of
configured volume space utilization. When a volume
utilization threshold is exceeded between routine sweep
operations, the space task initiates one of the space
management procedures to reduce the volume space utilization
to the next lowest threshold. For example, when the level of
volume utilization is between the acceptable and critical
levels, the space task begins to truncate pre-migrated data
files until the level of volume utilization is reduced below
the acceptable level. The pre-migration of data files thereby
enables the data storage management system to instantly
provide additional data storage space when the level of
volume utilization is excessive. Similarly, when the level of
volume utilization exceeds the critical level, the critical
migrate job is scheduled for immediate execution and
functions to move the lowest priority data files to secondary
storage until the acceptable level has been reached.

The data file migration processes can be configured in
various ways to customize the space management task. In
particular, while the sweep process is normally activated
during times of lowest network activity, the sweep process
can be continually operational as a background procedure,
with the level of sweep activity being controlled to suit the
space management requirements. Thus, the sweep operation
can include an "accelerator" capability. In addition, the
sweep operation can be activated upon the completion of the
demand migration process or the critical migration process
to bring the level of volume utilization down to the optimal
level. The sweep operation can also be concurrently
operational with the data file recall operation since the system is
a multiprocess system.

Routine Sweep Operation

FIG. 5 illustrates the various paths used in a data file
migration operation while FIG. 6 illustrates in flow diagram
form the operational steps taken by the data storage
management application to perform the routine sweep operation.
The sweep operation is activated on a routine basis, such as
at a predetermined time each night. As illustrated in FIG. 10,
each client application program (such as DOS®,
Windows™, NetWare® File Server) is provided with a
storage service agent module 121-123 whose personality is
tailored to match the underlying client platform. In addition,
an administrative user interface 131 is provided to
implement the following software modules: storage manager,
media manager, device manager, backup manager. The
storage manager provides general job, configuration, setup
and system information monitoring functions. The media
manager provides media-specific operations such as copy
media and restore media. The device manager provides
device specific operations such as add a device and delete a
device. The backup manager provides backup operations,
such as definition of the number of backup sets, rotation
definitions and redundancy. The number and function of the
various modules are a matter of design choice and are noted
here simply to illustrate the invention.
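
The space management behaviour described above under Real Time Network Storage Space Management can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation of the invention: the names DataFile, select_sweep_candidates and space_task_action are hypothetical, utilization levels are expressed as fractions of volume capacity, and last access time is used as the only priority rule.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    name: str
    size: int           # space occupied on the network volume
    last_access: float  # time of last access, in seconds


def select_sweep_candidates(files, capacity, optimal, now, min_inactive):
    """Pick least recently used files until utilization reaches `optimal`.

    Hypothetical helper: `optimal` is a fraction of `capacity`, and files
    accessed within `min_inactive` seconds of `now` are never selected.
    """
    used = sum(f.size for f in files)
    target = optimal * capacity
    candidates = []
    # Oldest access time first, i.e. the bottom of a least recently used list.
    for f in sorted(files, key=lambda f: f.last_access):
        if used <= target:
            break
        if now - f.last_access < min_inactive:
            continue  # too recently used to be a migration candidate
        candidates.append(f)
        used -= f.size
    return candidates


def space_task_action(utilization, acceptable, critical):
    """Choose the procedure the space task starts between routine sweeps."""
    if utilization > critical:
        return "critical_migrate"       # migrate lowest priority files now
    if utilization > acceptable:
        return "truncate_premigrated"   # free space already copied to secondary storage
    return "none"
```

In this sketch the truncation of pre-migrated data files is the cheaper action because their contents already exist in secondary storage, which mirrors the reason given above for pre-migration.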
When the sweep operation is initiated at step 601 at the
predetermined time, the operations kernel 501 in storage
server processor 51 accesses at step 602, via network
interface 502, data communication link 11 and network
interface 503, the data file directory 511 that is stored in
memory associated with file system manager 521 in file
server 41. The contents of all the network volumes stored in
data storage device 31 which is part of file server 41 are
listed in directory 511. File system manager 521 typically
manages directory 511, which lists the data file, its storage
location and attributes. Operations kernel 501 at step 603
orders all the data files in each managed network volume in
a predetermined manner into a priority list, such as a least
recently used list. The bottom entries of the list represent the
present migration candidate set. The migration candidates
are selected based on a number of data file attributes, such
that the set of management candidates are of sufficient extent
to provide sufficient free data storage space to satisfy the free
space objectives for this managed network volume. In
addition, these management candidates have been inactive
for a period of time greater than a minimum inactive period.

The device manager 504 of storage server 50 is activated
at step 604 by operations kernel 501 and at step 605 sweeps
the migration candidates from the selected managed network
volume, transmits and assembles them into a transfer unit
within the top layer 311 in the secondary storage 52. FIG. 5
illustrates the migrated data file path through the data
storage management system. In particular, the migration
candidate data file is selected by the operations kernel 501
and removed from the managed volume of data storage
device 31, after transmitting the data file via network
interface 503, the data communication link 11 of network 1 and
network interface 502 to the storage server 50 and checking
that the data file has been transferred correctly. Storage
server 50 thus writes the transfer unit containing the
transferred data file and other data files to level 1 (311) of the
secondary storage 52.

The data file is listed in the directory 511 of the network
volume on which the processor 21 has written the data file.
This directory listing is modified by the operations kernel
501 at step 606 to enable the processor 21 to obtain the data
file whether it is stored on the managed volume in the
network volume or on a volume in the secondary storage 52.
This is accomplished by the operations kernel 501 providing
a "placeholder entry" in the directory 511 of the managed
volume. This entry lists the data file as having an extent of
"0" and data is provided in the directory attributes or
metadata area for the data file that points to the catalog entry,
created at step 607 by systems services 505, in the secondary
storage directory 531 that lists the storage location in the
secondary storage 52 that contains the migrated data file.
The directory of the location of a particular data file in
secondary storage 52 is maintained in the network volume
itself. This is accomplished by the use of a secondary storage
directory 531 that is maintained in file server 41 by the
operations kernel 501 and systems services 505 of storage
server 50. The directory 511 and secondary storage directory
531 can both be written on the data storage device 31 of file
server 41.

The use of a key or pointer in the placeholder entry to
indicate the secondary storage directory entry for the
requested data file is preferably accomplished by storing the
key as part of the data file attributes in the file system. This
enables both the placeholder entry and the secondary storage
directory to survive data file renaming activity on the part of
the requesting processor. File systems commonly rename
data files and, if the key were part of the file name, it would
be lost in the renaming activity. However, data file attributes
are preserved as part of a data file renaming procedure.
When a data file rename occurs, the name ascribed to this
data file is modified and the entry in the network directory
is suddenly placed in a different part of the file system
primary storage directory. The data file attributes are
transported in unmodified form with the new data file name and,
since the placeholder is part of the data file attributes, the
newly renamed data file attributes still point to the correct
secondary storage directory entry and the rename is thereby
transferred to the secondary storage directory automatically.
Thus, the virtual segment of the file system automatically
tracks the renaming of the data files in the primary segment
of the file system.

The migrated data file is received by the storage server 50
and written at a selected available data storage space in a
migration volume of a data storage device 61 in level one
311 of the secondary storage 52. In addition, if shadow
volumes 65 are provided in the secondary storage 52 for data
reliability purposes, the migrated data file is also written at
step 608 into selected available data storage space on one of
the shadow volumes 65. Groups of data files stored on the
shadow volumes 65 are also periodically dumped after a
period of sweep activity has occurred at step 609 via a
special backup drive 71 on to backup media element 72 to
ensure disaster recovery capability. To more efficiently
manage data files in the hierarchy, the operations kernel 501 can
assemble a plurality of data files into a transfer unit of
predetermined size for continued migration to lower levels
in the hierarchy. A candidate size for the transfer unit is a
standard object size for the media that is used to implement
the first layer 311 of the secondary storage 52. It is desirable
that the transfer units that are used in the secondary storage
52 fit into all media with minimum boundary fragmentation.

The data files that are written to the migration volumes 61
and shadow volumes 65 have their physical storage location
identification written into a secondary storage directory
owned by the storage server 50. This directory can be
implemented entirely within the storage server 50, but
would take up a great deal of data storage space and be
difficult to protect. Instead, this directory is distributed
among the file servers 41-43 that contain managed volumes
31-33 for the processors 21, 22, with each piece of the
directory representing the secondary storage directory 531
for the managed volume on the primary data storage device
31-33. The placeholder entry in the file server 41-43 points
to this directory entry in the secondary storage directory 531.
Thus, the processor 21 that requests access to this migrated
data file can obtain the requested data file without being
aware of the existence of the secondary storage 52. This is
accomplished (as described in detail below) by the storage
service agent 121, which obtains the placeholder entry from the
file server directory 511, which points to the directory entry
in the secondary storage directory 531. This identified
directory entry in the secondary storage directory 531
contains the address in the migration volume that contains the
requested data file.

This data file migration procedure is replicated within the
secondary storage 52 for each layer of the hierarchical data
storage. Thus, as each layer of the secondary storage 52
becomes utilized in excess of a predetermined threshold, the
data files are relocated to the next lower layer of the data
storage hierarchy.

The particular segmentation of the storage server 50
illustrated herein between operations kernel 501, device
manager 504 and system services 505 represents but one of
a number of possible implementations of the functionality
provided by storage server 50. It is anticipated that other
divisions of responsibility among these elements or other
combinations of elements are possible without departing
from the concepts embodied in this description.

File Systems

The data management system makes use of a file system
structure that provides a common repository for the
potentially diverse file systems of the client file servers 41-43.
The file system structure of the data management system
must not only accept the data files from the file servers
41-43, but must also serve the backend data storage, data
recall, data backup, data relocate and disaster recovery
functions that are inherent in the data management system,
wherein the media used for these functions can vary widely.
The media can be an update in place media, such as
magnetic disk, or can have only "append" capabilities, such
as magnetic tape. The data file transfers are typically large
in extent and must be such that data backup and data relocate
operations can be performed in an efficient manner. Typical
of file system architecture is a common DOS file system,
whose organization is illustrated in FIG. 13. This file system
has four basic components:

1. File naming convention.

2. Directory architecture, to organize data files by name so
they may be easily located.

3. Physical space allocation scheme that relates data file
names to physical location on a data storage media, and
which allows data storage space to be utilized and reclaimed
when data files are deleted.

4. File management scheme, including access methods.

For example, DOS data files are named with a 1-8 byte
name and a 0-3 byte extension, which are delimited by a
"." (nnnnnnnn.xxx). The directory architecture is illustrated in
FIG. 13 and takes the form of a hierarchical tree of directory
names. The root is typically a volume, from which a number
of directories branch. Each directory includes other
directories and/or data files. A full data file name is represented
by concatenating all the directory tree structure components
from the root to the particular data file, with components
being delimited by "\". An example of such a data file name
using this convention is "\vol\dir\dir3\filename.ext". Each
DOS volume on the file server has its own unique file
system. The physical space allocation on the data storage
media is accomplished by the use of a File Allocation Table
(FAT). The data storage space on a DOS volume is
segmented into allocation units termed clusters. All directory
and data file names in the volume are listed in the file
allocation table and hierarchically related by linkages
between parents and children in the directory tree. When a
data file name is entered into the file allocation table, space
is also provided for data file attributes such as hidden or
read-only, and the identification of the first cluster used to
store the data file is also noted. If additional clusters are
required to store this data file, these clusters are linked in a
chain via pointers, with the entire chain representing the
physical location of the data file on the data storage media.

Transfer Units

The data management system of this invention makes use
of a different directory structure to manage the storage of
data files on the data storage media of the secondary storage
52. The storage and relocation of data files among the
various layers of the secondary storage 52 is simplified by
the use of transfer units. A transfer unit represents a block of
data of predetermined size which contain virtual file system
objects (e.g. data files) that move together to the backup
system and through the hierarchy, with each transfer unit
being assigned a unique identification within the data
management system.

As noted above, the operations kernel 501 of the storage
server processor 51 orders data files in each managed
volume of the file systems 41-43 according to a
predetermined algorithm. The ordering can be based on data file
usage, content, criticality, size or whatever other criteria is
selected. For the purpose of simplicity, a simple least
recently used (LRU) ordering is described. The operations
kernel 501 orders the data files in each managed volume on
an LRU basis and the entries on the bottom of the list
represent migration candidates. The operations kernel 501
periodically sweeps the migration candidate data files from
the managed volumes and assembles them serially by
managed volume into a transfer unit containing a plurality of
data files. The full data file name is entered into the
secondary storage directory 531, together with data file location
information: the location of the data file within the transfer
unit, transfer unit identification, media identification.
The data file name is always logically related to the original
transfer unit identification; the data file is never moved to
another transfer unit, but remains in the transfer unit with the
other temporally related data files from each virtual file
system at the time of migration to secondary storage 52. The
media object is itself associated with transfer units, not data
files. In this manner, one directory is used to note the
correspondence between data files and transfer unit and a
second directory is used to note the correspondence between
transfer units and media object. When transfer units are
relocated from one media to another, the data file directory
need not be updated since the data files remain in the original
transfer unit and it is simply the change in location of the
transfer unit on the media that must be noted.

The storage server processor 51 may not have sufficient
data files to completely fill a transfer unit within a reasonable
period of time. The storage server processor 51 writes a
partial transfer unit to the secondary storage 52 and the
backup media 82 upon the completion of the migration.
When additional migrated data files are received from the
file servers, the storage server processor 51 appends the
partially filled transfer unit with a complete transfer unit that
comprises the previously written partial transfer unit with
the additional received data files that completely fill the
transfer unit. The storage server processor 51 tracks the
partial nature of the transfer unit. The use of the partial
transfer unit write process reduces the window of
vulnerability since migrated data files are written to backup media
on a periodic and timely basis and without rewriting
unchanged data.

This file system separates the logical allocation of data
storage from the physical storage allocation, with the logical
allocation for all layers of the data storage hierarchy being
the same since the data file remains in its unique transfer
unit. One significant advantage of this system is that when
transfer units are migrated from layer to layer in the
hierarchy or placed on a backup media, only the relationship
between transfer unit identification and media object need be
updated to reflect the new media on which this transfer unit
is stored. Furthermore, the data file retains its relationship to
the transfer unit in the backup system, and the backup media
simply provides a redundant media object for the same
transfer unit identification. The transfer unit is then written
into the first layer 311 of the secondary storage 52. This
procedure is used to relocate transfer units from one layer in
the data storage hierarchy to the next lower layer in the data
storage hierarchy. The block diagram of FIG. 11 illustrates
the nested nature of the transfer units. Thus, the transfer unit 
of data files from the primary storage represents a data block 
of a first extent. The second layer transfer unit, assembled to 
relocate data files from the first layer of the hierarchical data 
storage to the second layer, can be composed of a plurality 
of first layer transfer units. Similarly, this process can be 
applied to successive layers of the data storage hierarchy. 
FIG. 11 illustrates the resultant stream of data that is written 
on to the lowest layer of the data storage hierarchy for a three 
layer secondary storage, consisting of a plurality of sequen- 
tially ordered second layer transfer units, each of which is 
comprised of a plurality of first layer transfer units. 
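
The two-directory arrangement described for transfer units (one directory relating data files to transfer units, a second relating transfer units to media objects) can be sketched in Python as follows. This is a hedged illustration only: the class TransferUnitCatalog, its method names, and the use of a list index as the position of a data file within a transfer unit are assumptions made for the example and do not appear in the specification.

```python
class TransferUnitCatalog:
    """Minimal two-level directory: file -> transfer unit, transfer unit -> media."""

    def __init__(self):
        # Full data file name -> (transfer unit id, position within the unit).
        self.file_dir = {}
        # Transfer unit id -> media object id currently holding the unit.
        self.media_dir = {}

    def add_unit(self, unit_id, media_id, file_names):
        """Record a newly written transfer unit and the data files it contains."""
        self.media_dir[unit_id] = media_id
        for position, name in enumerate(file_names):
            self.file_dir[name] = (unit_id, position)

    def relocate_unit(self, unit_id, new_media_id):
        """Move a unit to another layer: only the unit-to-media entry changes."""
        self.media_dir[unit_id] = new_media_id

    def locate(self, name):
        """Return (media object id, transfer unit id, position) for a data file."""
        unit_id, position = self.file_dir[name]
        return self.media_dir[unit_id], unit_id, position
```

Relocating a unit in this sketch touches a single entry regardless of how many data files the unit holds, which is the advantage the description attributes to keeping each data file in its original transfer unit.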

An alternate form of file system is illustrated in FIG. 12, 
wherein the physical allocation system is overlaid on the 
particular media type and hierarchy layer. Media at each 
layer of the data storage hierarchy is allocated in transfer 
units termed chunks for this approach, which have variable 
size, up to a predetermined maximum. If the underlying 
physical space allocation management permits, the chunks
start small and grow according to need. Otherwise the 
chunks are pre-allocated in fixed size blocks and filled as 
needed. Only the data files from a particular network volume 
are stored in a selected chunk or plurality of chunks (chunk 
set) at each layer of the data storage hierarchy. Thus, the
chunk set at a given layer represents the portion of the virtual 
file system that is stored at that layer. The block diagram of 
FIG. 12 illustrates the nested nature of the chunks. Thus, the 
chunk of data files from the primary storage represents a data 
block of a first extent, containing data files from only a
single network volume. The second layer chunk assembled 
to relocate data files from the first layer of the hierarchical 
data storage to the second layer can be composed of a 
plurality of first layer chunks. Similarly, this process can be 
applied to successive layers of the data storage hierarchy.
FIG. 12 illustrates the resultant stream of data that is written 
on to the lowest layer of the data storage hierarchy for a three 
layer secondary storage, consisting of a plurality of
sequentially ordered second layer chunks, each of which is
comprised of a plurality of first layer chunks.

Reconfiguration of Layers in the Hierarchy 

The number and configuration of the layers of the
hierarchy can be dynamically altered to suit the needs of the
user. Additional layers can be added to the hierarchy or
deleted therefrom. In addition, data storage capacity can be 
added or deleted from any selected layer of the hierarchy by 
the inclusion or exclusion of data storage devices from that 
selected layer. The data storage management system
automatically adapts to such modifications of the hierarchy in a
manner that ensures maximum performance and reliability. 
The shelf layer that is implemented by the backup drive 71 
and the mountable backup data storage element 72 can 
provide an overflow capacity for the first layer 311 of the 
secondary storage 52 if no additional layers are provided, or
for the lowest layer 313 if multiple layers are provided. 
Thus, when there is no longer any available data storage 
space on the lowest layer of the hierarchy, transfer units or 
media units are deleted from this layer. If additional data 
storage capacity in the form of additional data storage
devices are added to this layer, or alternatively, an additional 
layer of media is provided below the previously lowest layer 
of media, the deleted transfer or media units can be returned 
to the hierarchy from the backup mountable data storage 
elements 72. This is accomplished by the storage server 51
noting the presence of newly added available data storage 
space on the lowest layer of the hierarchy and previously 
deleted transfer or media units. The storage server 51
accesses the media object directory to identify the location 
of the deleted data and retrieve this data from an identified 
backup mountable data storage element 72, which is 
mounted on backup drive 71. This retrieved data is then 
written on to the newly added media in available data 
storage space. This process is also activated if a data storage 
device is removed from a layer of the media or added to a 
layer of the media. If this media modification occurs in any 
but the lowest layer, the deleted transfer units or media 
objects are retrieved from the backup mountable data stor- 
age element 72 and stored on the same layer as they 
originally were stored unless insufficient space is available 
on that layer, in which case they are stored on the media 
level immediately below the level on which the data storage 
device was removed. 
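
The return of previously deleted transfer units to the hierarchy when data storage capacity is added can be sketched in Python as follows. This is an illustrative assumption rather than the method of the specification: the function name restore_deleted_units, the tuple layout, and the simple first-fit order in which units are restored are all hypothetical.

```python
def restore_deleted_units(deleted_units, free_space):
    """Decide which deleted transfer units fit in newly available space.

    `deleted_units` is a list of (unit id, size, backup element id) tuples,
    standing in for what the media object directory would record.
    Returns (restored, still_deleted); restored entries name the backup
    element from which each unit would be read.
    """
    restored, still_deleted = [], []
    for unit_id, size, backup_id in deleted_units:
        if size <= free_space:
            # Unit fits: it would be read from the mounted backup element
            # and written to the newly added media.
            restored.append((unit_id, backup_id))
            free_space -= size
        else:
            still_deleted.append((unit_id, size, backup_id))
    return restored, still_deleted
```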



Data File Recall 

As illustrated in flow diagram form in FIG. 8 and with 
reference to the system architecture in FIG. 7, a data file
recall operates in substantially the reverse direction of data 
file migration. As noted above, the data files that are written 
to the migration volumes 61 and shadow volumes 65 have 
their physical storage location identification written into a 
secondary storage directory 531 in the file server 41. The 
placeholder entry in directory 511 on the file server 41 points 
to this secondary storage directory entry. Thus, the processor 
21 at step 801 requests access to this migrated data file and 
this request is intercepted at step 802 by a trap or interface 
711 in the file server 41. The trap can utilize hooks in the file 
system 41 to cause a branch in processing to the storage 
server agent 121 or a call back routine can be implemented 
that allows the storage server agent 121 to register with the 
file system 41 and be called when the data file request is 
received from the processor 21. In either case, the trapped 
request is forwarded to storage server agent 121 to deter- 
mine whether the requested data file is migrated to second- 
ary storage 52. This is accomplished by storage server agent 
121 at step 803 reading directory 511 to determine the 
location of the requested data file. If a placeholder entry is 
not found stored in directory 511 at step 805, control is 
returned to the file server 41 at step 806 to enable the file 
server 41 to read the directory entry that is stored in 
directory 511 for the requested data file. The data stored in 
this directory entry enables the file server 41 to retrieve the 
requested data file from the data storage device 31 on which 
the requested data file resides. If at step 805, storage server 
agent 121 determines, via the presence of a placeholder 
entry, that the requested data file has been migrated to 
secondary storage 52, storage server agent 121 at step 807 
creates a data file recall request and transmits this request 
together with the direct access secondary storage pointer key 
stored in the placeholder entry via network 1 to storage 
server 50. At step 808, operations kernel 501 uses systems 
services 505 which uses the pointer key to directly retrieve 
the entry in secondary storage directory 531. This identified 
entry in the secondary storage directory 531 contains the 
address in the migration volume that contains the requested 
data file. The address consists of the transfer unit identifi- 
cation and position of the data file in the transfer unit. The 
device manager 504 uses the data file address information to 
recall the requested data file from the data storage device on 
which it is stored. This data storage device can be at any 
level in the hierarchy, as a function of the activity level of the 
data file. Device manager 504 reads the data file from the 
storage location in the data storage device identified in the 
secondary storage directory 531 and places the retrieved 
data file on the network 1 for transmission to the file server
41 and volume 31 that originally contained the requested 
data file. Systems services 505 of operations kernel 501 then 
updates the secondary storage directory 531 and the
directory 511 to indicate that the data file has been recalled to the
network volume. At step 811, control is returned to file 
server 41, which reads directory 511 to locate the requested 
data file. The directory 511 now contains information that 
indicates the present location of this recalled data file on data 
storage device 31. The processor 21 can then directly access
the recalled data file via the file server 41. 
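
The recall decision made by the storage server agent can be summarized in a short Python sketch. It is a simplified illustration under stated assumptions: DirectoryEntry, recall_if_migrated and the read_secondary callback are hypothetical names, the network transfer is omitted, and removing the secondary storage directory entry after recall is only one possible way to "update" that directory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectoryEntry:
    extent: int                            # 0 for a placeholder entry
    location: Optional[str] = None         # location on the primary device
    placeholder_key: Optional[str] = None  # key into the secondary storage directory


def recall_if_migrated(name, directory, secondary_dir, read_secondary):
    """Recall `name` if its directory entry is a placeholder.

    Returns False when no placeholder is present, in which case the file
    server would simply read its own directory entry as usual.
    """
    entry = directory[name]
    if entry.placeholder_key is None:
        return False
    # The key gives direct access to the secondary storage directory entry,
    # which holds the transfer unit identification and position in the unit.
    unit_id, position = secondary_dir[entry.placeholder_key]
    data = read_secondary(unit_id, position)
    # Update both directories to show the file is back on the network volume.
    directory[name] = DirectoryEntry(extent=len(data), location="primary:" + name)
    del secondary_dir[entry.placeholder_key]  # assumption: entry is dropped on recall
    return True
```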

Disaster Recovery 

A number of techniques are used to protect the 
integrity of data files in the data management system of the 
present invention. Primary storage backups are 
typically implemented to stream data files from each network 
volume on to a backup device (not shown). Within the 
data management system, the sweep routine produces data 
file streams that represent a mixture of data files from the 
network volumes. This data is not only written to transfer 
units on the data storage media of the first layer of 
secondary storage 52, but is also written from the data 
storage media of the first layer of secondary storage 52 on 
to backup media 72 on a backup device 71. Furthermore, this 
data is replicated on shadow volumes 65. The backup 
process periodically backs up the transfer units that are 
written on to the first layer of the secondary storage, even if 
the transfer units are only partially filled. If the backup 
media 72 is rotated off-site, a number of backup media 72 
will contain various transfer units, each at a different level of 
completion. Each time a backup media 72 is mounted on 
backup device 71, operations kernel 501 and device manager 
504 cooperate to update any partially filled transfer 
units to the present level of completion to ensure that the 
backup media reflects the present state of the system. 
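
The update performed when a backup media is mounted can be sketched as follows. This is an illustrative model only: the function name update_backup_media is hypothetical, each transfer unit is modeled as a byte string, and transfer units are assumed to grow by appending, so that the copy on the backup media is always a prefix of the copy on the first layer.

```python
def update_backup_media(backup_units, first_layer_units):
    """Bring transfer units already on a mounted backup media up to date.

    Both arguments map a transfer unit id to the bytes written so far.
    Returns the number of bytes appended to the backup media.
    """
    appended = 0
    for unit_id, saved in list(backup_units.items()):
        current = first_layer_units.get(unit_id, saved)
        if len(current) > len(saved):
            # Copy only the portion written since this media was last mounted.
            backup_units[unit_id] = saved + current[len(saved):]
            appended += len(current) - len(saved)
    return appended
```

Only the transfer units already present on the mounted media are extended in this sketch; how new transfer units are assigned to a backup media is not modeled.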

A further level of data protection is provided, as described 
above, by the backup subsystem. When a media unit on the 
third layer 313 is filled, the contents of this media unit can 
be copied to a backup tape to construct a duplicate media 
unit termed the media replacement unit. This provides 
duplicate copies of the media unit and, should the media unit 
stored on the third layer 313 fail, the media replacement unit 
provides full redundancy of all the data stored therein. The 
media replacement units are typically stored in an off-site 
repository to provide physical separation of the media in the 
event of fire or other possible event that could destroy or 
damage the media stored on-site. Thus, if a media failure 
occurs, the media replacement unit can be loaded in a library 
device in the system to immediately provide the data files, 
rather than having to stream this data from one media to 
another. 

In addition, the secondary storage directory 531, since it 
is distributed on network volumes, is backed up on to the 
primary storage backup media as noted above. This metadata 
can also be optionally replicated into a data storage 
device of the secondary storage or backed up on to the 
backup media 72. 

We claim: 

1. A data storage management system for a data network 
which functions to interconnect a plurality of file servers, 
each of which stores data files, said data storage management 
system comprising: 

directory means, located in each of said plurality of file 
servers, for identifying a storage location of each data 
file stored on said file server; 

secondary storage means, comprising a multi-layer hierarchical 
memory, wherein said layers in said multi-layer 
hierarchical memory comprise media of differing 
characteristics, for storing data files migrated from said 
plurality of file servers; 

storage server means connected to said network for automatically 
managing transfer of data files between said 
plurality of file servers and said secondary storage 
means, comprising: 

means for migrating selected data files from said plurality 
of file servers to said secondary storage means; 

means for writing said migrated selected data files 
received from said plurality of file servers into selected 
available memory space in said multi-layer hierarchical 
memory, absent reservation of memory space in said 
multi-layer hierarchical memory on a file server basis; 

means for writing in said directory means at a directory 
location for each of said migrated selected data files, 
data indicating that said migrated selected data file has 
been migrated to said secondary storage means and 
data identifying a physical data storage location in said 
storage server means for said migrated selected data 
file, which physical data storage location contains data 
indicative of a locus in said multi-layer hierarchical 
memory which contains said migrated selected data 
file; 

means for collecting a plurality of data files, that are 
transmitted to said secondary storage means, into a 
transfer unit; and means for storing said transfer unit on 
a first layer of said hierarchy. 

2. The system of claim 1 wherein said means for storing 
data comprises: 

transfer unit directory means for storing data indicative of 
a correspondence between a data file and a transfer unit 
in which said data file is located; and 

media object directory means for storing data indicative 
of a correspondence between a transfer unit and a 
media on which said transfer unit is located. 
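
By way of illustration only, and not as part of the claims, the two directories recited in claim 2 can be modeled as a two-step lookup. The names locate_file and relocate_transfer_unit are hypothetical, and both directories are assumed to be simple in-memory dictionaries.

```python
def locate_file(file_id, transfer_unit_directory, media_object_directory):
    """Resolve a data file to (media, transfer unit) using two directories."""
    transfer_unit = transfer_unit_directory[file_id]  # data file -> transfer unit
    media = media_object_directory[transfer_unit]     # transfer unit -> media
    return media, transfer_unit


def relocate_transfer_unit(transfer_unit, new_media, media_object_directory):
    """Record that a whole transfer unit now resides on a different media."""
    media_object_directory[transfer_unit] = new_media
```

One consequence of this arrangement, at least in this simplified model, is that moving a transfer unit to another media changes a single media object directory entry and leaves the per-file entries untouched.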

3. The system of claim 1 further comprising: 

means, located in each of said plurality of file servers, for 
intercepting a call at a selected file server to a data file 
that has been stored in said file server; 

means, responsive to said data written in said directory 
means indicating that said requested data file has been 
migrated to said secondary storage means, for recalling 
said requested data file from said secondary storage 
means to said file server, comprising: 

means for reading said data stored in said directory means 
to identify a physical data storage location in said 
storing means that contains data which identifies a 
locus in said secondary storage means of said requested 
migrated data file, 

means for retrieving said data stored in said identified 
physical storage location in said storing means, and 

means, responsive to said retrieved data, for transmitting 
said requested migrated data file from said locus in said 
secondary storage means to said selected file server. 

4. The system of claim 1 wherein said data, written by said 
writing means in said directory means indicating that said 
migrated selected data file has been migrated to said secondary 
storage means, is stored as part of the data file 
attributes. 

5. The system of claim 1 wherein said means for migrating 
data files comprises: 

means for ordering data files stored on said selected file 
server into a priority ordering by selected characteristics 
of said data files. 

6. The system of claim 5 wherein each said file server 
contains a plurality of volumes of data storage, said means 
for migrating data files further comprises: 

means for reviewing each volume of said at least one file 
server to identify lowest priority data files stored 
thereon. 

7. The system of claim 6 wherein said means for migrating 
data files further comprises: 

means for transmitting at least one of said identified 
lowest priority data files to said secondary storage 
means. 

8. The system of claim 7 wherein said storage server 
means further comprises: 

means for activating said data file migration means for 
successive lowest priority data files until available 
memory in a volume of said selected file server is at 
least as great as a predefined threshold. 
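
By way of illustration only, and not as part of the claims, the selection recited in claims 5 through 8 can be sketched as follows. The names FileInfo and select_for_migration are hypothetical, and the use of last access time as the ordering characteristic is an assumption of this sketch; the claims refer only to selected characteristics of the data files.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    size: int           # bytes occupied on the volume
    last_access: float  # assumed priority key: older access = lower priority


def select_for_migration(files, capacity, free_threshold):
    """Pick lowest priority files until free space reaches the threshold."""
    free = capacity - sum(f.size for f in files)
    selected = []
    # Lowest priority first (least recently accessed, by assumption).
    for f in sorted(files, key=lambda f: f.last_access):
        if free >= free_threshold:
            break
        selected.append(f)
        free += f.size  # space released once this file is migrated
    return selected
```

The loop stops as soon as the available space on the volume is at least as great as the predefined threshold, so no more files are migrated than are needed.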

9. The system of claim 7 wherein said storage server 
means further comprises: 

means for scheduling activation of said means for migrating 
data files on a temporal basis. 

10. The system of claim 7 wherein said storage server 
means further comprises: 

means for activating said means for migrating data files as 
a function of volume space utilization. 

11. The system of claim 7 wherein said storage server 
means further comprises: 

means for activating said means for migrating data files as 
a function of activity on said data network. 

12. The system of claim 6 wherein said means for 
migrating data files further comprises: 

means for copying at least one of said priority ordered 
data files from said selected file server to said secondary 
storage means; and 

means responsive to a subsequent determination of insufficient 
available data storage space on said selected file 
server for utilizing data storage space occupied by said 
copied at least one said priority ordered data files as 
available data storage space. 
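
By way of illustration only, and not as part of the claims, the copy-then-reclaim behavior recited in claim 12 can be sketched as follows. The class Volume and its methods are hypothetical, secondary storage is modeled as a dictionary, and reclaiming the oldest copied file first is an assumption of this sketch.

```python
class Volume:
    """Hypothetical model of a file server volume with pre-copied files."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = {}      # name -> bytes still held on the volume
        self.copied = []     # names already copied to secondary storage

    def free_space(self):
        return self.capacity - sum(len(d) for d in self.files.values())

    def copy_to_secondary(self, name, secondary):
        """Copy a file to secondary storage but keep it on the volume."""
        secondary[name] = self.files[name]
        self.copied.append(name)

    def write(self, name, data):
        """Store a new file, reclaiming space from copied files if needed."""
        while self.free_space() < len(data) and self.copied:
            victim = self.copied.pop(0)
            # A copy already exists in secondary storage, so the space can
            # be reused immediately without transferring any data.
            self.files.pop(victim, None)
        if self.free_space() < len(data):
            raise OSError("insufficient space on volume")
        self.files[name] = data
```

In this model the cost of migration is paid when the file is copied, and the later determination of insufficient space only has to release the already copied file's storage.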

13. The system of claim 1 wherein at least one of said 
layers comprises: 

a plurality of data storage elements for storing data files 
migrated from said file servers; 

at least one data storage element drive means for 
reading/writing data on a data storage element mounted in said 
data storage element drive means; and 

automated data storage element management means for 
robotically mounting a selected one of said plurality of 
data storage elements in said data storage element drive 
means. 

14. In a data storage management system for a data 
network which functions to interconnect a plurality of file 
servers, each of which stores data files, a method of data 
storage management comprising the steps of: 

storing data in a directory, located in each of said plurality 
of file servers, for identifying a storage location of each 
data file stored on said file server; 

storing, in a secondary storage system, comprising a 
multi-layer hierarchical memory, wherein said layers in 
said multi-layer hierarchical memory comprise media 
of differing characteristics, data files migrated from 
said plurality of file servers; 

automatically managing transfer of data files between said 
plurality of file servers and said secondary storage 
system, comprising the step of: 



migrating selected data files from said plurality of file 
servers to said secondary storage system; 

writing said migrated selected data files received from 
said plurality of file servers into selected available 
memory space in said multi-layer hierarchical memory, 
absent reservation of memory space in said multi-layer 
hierarchical memory on a file server basis; 

writing in said directory at a directory location for each of 
said migrated selected data files, data indicating that 
said migrated selected data file has been migrated to 
said secondary storage system and data identifying a 
physical data storage location in said storage server for 
said migrated selected data file, which physical data 
storage location contains data indicative of a locus in 
said multi-layer hierarchical memory which contains 
said migrated selected data file; 

a multi-layer hierarchical memory, wherein said layers in 
said hierarchical memory comprise media of differing 
characteristics, said method further comprises: 

collecting a plurality of data files, that are transmitted to 
said secondary storage system, into a transfer unit; and 

storing said transfer unit on a first layer of said hierarchy. 

15. The method of claim 14 wherein said step of storing 
data comprises: 

storing, in a transfer unit directory, data indicative of a 
correspondence between a data file and a transfer unit 
in which said data file is located; and 

storing, in a media object directory, data indicative of a 
correspondence between a transfer unit and a media on 
which said transfer unit is located. 

16. The method of claim 14 further comprising: 

intercepting, in each of said plurality of file servers, a call 
at a selected file server to a data file that has been stored 
in said file server; 

recalling, in response to said data written in said directory 
indicating that said requested data file has been 
migrated to said secondary storage system, said 
requested data file from said secondary storage system 
to said file server, comprising the steps of: 

reading said data stored in said directory to identify a 
physical data storage location in said memory that 
contains data which identifies a locus in said secondary 
storage system of said requested migrated data file, 

retrieving said data stored in said identified physical 
storage location in said memory, and 

transmitting, in response to said retrieved data, said 
requested migrated data file from said locus in said 
secondary storage system to said selected file server. 

17. The method of claim 14 wherein said data, written in 
said directory indicating that said migrated selected data file 
has been migrated to said secondary storage system, is 
stored as part of the data file attributes. 

18. The method of claim 14 wherein said step of migrating 
data files comprises: 

ordering data files stored on said selected file server into 
a priority ordering by selected characteristics of said 
data files. 

19. The method of claim 18 wherein each said file server 
contains a plurality of volumes of data storage, said step of 
migrating data files further comprises: 

reviewing each volume of said at least one file server to 
identify lowest priority data files stored thereon. 

20. The method of claim 19 wherein said step of migrating 
data files further comprises: 

transmitting at least one of said identified lowest priority 
data files to said secondary storage system. 

21. The method of claim 20 further comprising: 

activating said data file migration step for successive 
lowest priority data files until available memory in a 
volume of said selected file server is at least as great as 
a predefined threshold. 

22. The method of claim 20 further comprising: 

scheduling activation of said step of migrating data files 
on a temporal basis. 

23. The method of claim 20 further comprising: 

activating said step of migrating data files as a function of 
volume space utilization. 

24. The method of claim 20 further comprising: 

activating said step of migrating data files as a function of 
activity on said data network. 

25. The method of claim 19 wherein said step of migrating 
data files further comprises: 

copying at least one of said priority ordered data files from 
said selected file server to said secondary storage 
system; and 

utilizing, in response to a subsequent determination of 
insufficient available data storage space on said 
selected file server, data storage space occupied by said 
copied at least one said priority ordered data files as 
available data storage space. 